Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Nonconvex Stochastic Optimization: Nonasymptotic Performance Bounds and Momentum-Based Acceleration

Authors

Gao, Gürbüzbalaban, and Zhu

Abstract

Nonconvex stochastic optimization problems arise in many machine learning applications, including deep learning. Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is a variant of the stochastic gradient with momentum method in which controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates toward a global minimum. SGHMC has shown empirical success in practice for solving nonconvex problems. In "Global convergence of stochastic gradient Hamiltonian Monte Carlo for nonconvex stochastic optimization: Nonasymptotic performance bounds and momentum-based acceleration," Gao, Gürbüzbalaban, and Zhu provide, for the first time, finite-time performance bounds for the global convergence of SGHMC in the context of both population and empirical risk minimization, and show that momentum-based acceleration is possible for nonconvex stochastic optimization.
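To make the recursion concrete, here is a minimal Python sketch of the SGHMC iteration described above; the function name sghmc, the step size eta, the friction gamma, and the inverse temperature beta are illustrative choices, not the authors' code.

```python
import numpy as np

def sghmc(grad_fn, x0, n_iters=10_000, eta=1e-3, gamma=1.0, beta=1e3, rng=None):
    """Minimal SGHMC sketch (illustrative, not a reference implementation).

    grad_fn(x) should return an unbiased stochastic gradient of the objective F.
    eta   : step size
    gamma : friction coefficient
    beta  : inverse temperature controlling the injected Gaussian noise
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    noise_scale = np.sqrt(2.0 * gamma * eta / beta)
    for _ in range(n_iters):
        # momentum update: friction + stochastic gradient + properly scaled Gaussian noise
        v = v - eta * (gamma * v + grad_fn(x)) + noise_scale * rng.standard_normal(x.shape)
        # position update
        x = x + eta * v
    return x

if __name__ == "__main__":
    # usage sketch: a nonconvex toy objective F(x) = (x^2 - 1)^2 with noisy gradients
    rng = np.random.default_rng(0)
    grad = lambda x: 4.0 * x * (x**2 - 1.0) + 0.1 * rng.standard_normal(x.shape)
    print(sghmc(grad, x0=np.array([2.0]), rng=rng))
```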


Similar articles

Stochastic Gradient Hamiltonian Monte Carlo

Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals. The popularity of such methods has grown significantly in recent years. However, a limitation of HMC methods is the required gradient com...
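For orientation, a minimal sketch of one HMC proposal built with the leapfrog integrator, assuming exact gradients of the target log-density are available (all names are hypothetical and the Metropolis-Hastings accept/reject step is omitted); the cost of these exact gradients is what motivates the stochastic-gradient variant.

```python
import numpy as np

def hmc_proposal(grad_log_p, q, step=0.1, n_leapfrog=20, rng=None):
    """One HMC proposal via the leapfrog integrator (sketch).

    grad_log_p(q) must return the gradient of the target log-density at q.
    Returns the proposed position plus initial and final momenta, which a
    full sampler would use in the Metropolis-Hastings acceptance test.
    """
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(q, dtype=float)
    p0 = rng.standard_normal(q.shape)              # resample momentum
    q_new, p_new = q.copy(), p0.copy()
    p_new += 0.5 * step * grad_log_p(q_new)        # half-step for momentum
    for _ in range(n_leapfrog - 1):
        q_new += step * p_new                      # full step for position
        p_new += step * grad_log_p(q_new)          # full step for momentum
    q_new += step * p_new
    p_new += 0.5 * step * grad_log_p(q_new)        # final half-step for momentum
    return q_new, p0, p_new
```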


Stochastic Gradient Hamiltonian Monte Carlo

Supplementary Material A. Background on Fokker-Planck Equation. The Fokker-Planck equation (FPE) associated with a given stochastic differential equation (SDE) describes the time evolution of the distribution of the random variables under the specified stochastic dynamics. For example, consider the SDE $dz = g(z)\,dt + \mathcal{N}(0, 2D(z)\,dt)$ (16), where $z \in \mathbb{R}^n$, $g(z) \in \mathbb{R}^n$, and $D(z) \in \mathbb{R}^{n\times n}$. The distribution of z go...
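For reference, a plausible reconstruction of the FPE corresponding to the SDE quoted above, written in standard form; the density notation $p_t(z)$ is my labeling, not notation taken from the excerpt.

```latex
% Fokker-Planck equation for dz = g(z) dt + N(0, 2 D(z) dt)
\frac{\partial p_t(z)}{\partial t}
  = -\sum_{i} \frac{\partial}{\partial z_i}\bigl[g_i(z)\, p_t(z)\bigr]
    + \sum_{i,j} \frac{\partial^2}{\partial z_i \partial z_j}\bigl[D_{ij}(z)\, p_t(z)\bigr]
```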


Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Asynchronous parallel implementations of stochastic gradient (SG) have been broadly used in training deep neural networks and have achieved many successes in practice recently. However, existing theories cannot explain their convergence and speedup properties, mainly due to the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill the gaps in theory and provi...


Stochastic Recursive Gradient Algorithm for Nonconvex Optimization

In this paper, we study and analyze the mini-batch version of StochAstic Recursive grAdient algoritHm (SARAH), a method employing the stochastic recursive gradient, for solving empirical loss minimization for the case of nonconvex losses. We provide a sublinear convergence rate (to stationary points) for general nonconvex functions and a linear convergence rate for gradient dominated functions,...
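A minimal sketch of one SARAH outer loop, illustrating the recursive gradient estimator $v_t = \nabla f_i(w_t) - \nabla f_i(w_{t-1}) + v_{t-1}$; the helper grad_i and all parameter values are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sarah_epoch(grad_i, w0, n_samples, eta=0.05, inner_iters=100, rng=None):
    """One outer loop of SARAH (sketch).

    grad_i(w, i) returns the gradient of the i-th component loss at w.
    The estimator v is updated recursively from single-sample differences.
    """
    rng = np.random.default_rng() if rng is None else rng
    w_prev = np.asarray(w0, dtype=float)
    # full gradient at the start of the epoch
    v = np.mean([grad_i(w_prev, i) for i in range(n_samples)], axis=0)
    w = w_prev - eta * v
    for _ in range(inner_iters):
        i = rng.integers(n_samples)
        # recursive stochastic gradient estimator
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - eta * v
    return w
```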


Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization

A. Proof of Theorem 1. We first recall the following lemma. Lemma 1 (Lemma 1, Gong et al., 2013). Under Assumption 1, for any $\eta > 0$ and any $x, y \in \mathbb{R}^d$ such that $x = \mathrm{prox}_{\eta g}(y - \eta \nabla f(y))$, one has $F(x) \le F(y) - \bigl(\tfrac{1}{2\eta} - \tfrac{L}{2}\bigr)\|x - y\|^2$. Applying Lemma 1 with $x = x_k$, $y = y_k$, we obtain $F(x_k) \le F(y_k) - \bigl(\tfrac{1}{2\eta} - \tfrac{L}{2}\bigr)\|x_k - y_k\|^2$ (12). Since $\eta < \tfrac{1}{L}$, it follows that $F(x_k) \le F(y_k)$. Moreover, ...
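For orientation, a short sketch of the proximal gradient step with momentum that Lemma 1 is applied to, $x_k = \mathrm{prox}_{\eta g}(y_k - \eta \nabla f(y_k))$ with a momentum-extrapolated $y_k$; here g is taken to be an l1 penalty purely as an example, and all names and parameter values are hypothetical.

```python
import numpy as np

def prox_l1(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding), used as an example g."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_gradient_momentum(grad_f, x0, eta=0.01, beta=0.9, lam=0.1, n_iters=500):
    """Proximal gradient with momentum (sketch): extrapolate y_k, then a prox-gradient step."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        y = x + beta * (x - x_prev)                              # momentum extrapolation
        x_prev, x = x, prox_l1(y - eta * grad_f(y), eta * lam)   # x_k = prox_{eta*g}(y - eta*grad f(y))
    return x
```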



Journal

Journal title: Operations Research

Year: 2022

ISSN: 1526-5463, 0030-364X

DOI: https://doi.org/10.1287/opre.2021.2162